High-throughput prediction of protein antigenicity using protein microarray data

Authors

  • Christophe N. Magnan
  • Michael Zeller
  • Matthew A. Kayala
  • Adam Vigil
  • Arlo Z. Randall
  • Philip L. Felgner
  • Pierre Baldi
Abstract

Motivation: Discovery of novel protective antigens is fundamental to the development of vaccines for existing and emerging pathogens. Most computational methods for predicting protein antigenicity rely directly on homology with previously characterized protective antigens; however, homology-based methods will fail to discover truly novel protective antigens. Thus, there is a significant need for homology-free methods capable of screening entire proteomes for the antigens most likely to generate a protective humoral immune response.

Results: Here we begin by curating two types of positive data: (i) antigens that elicit a strong antibody response in protected individuals but not in unprotected individuals, using human immunoglobulin reactivity data obtained from protein microarray analyses; and (ii) known protective antigens from the literature. The resulting datasets are used to train a sequence-based prediction model, ANTIGENpro, to predict the likelihood that a protein is a protective antigen. ANTIGENpro correctly classifies 82% of the known protective antigens when trained using only the protein microarray datasets. The accuracy on the combined dataset is estimated at 76% by cross-validation experiments. Finally, ANTIGENpro performs well when evaluated on an external pathogen proteome for which protein microarray data were obtained after the initial development of ANTIGENpro.

Availability: ANTIGENpro is integrated in the SCRATCH suite of predictors available at http://scratch.proteomics.ics.uci.edu.

Contact: [email protected]

Similar resources

In Silico Prediction and Docking of Tertiary Structure of Multifunctional Protein X of Hepatitis B Virus

Hepatitis B virus (HBV) infection is a universal health problem and may result in acute, fulminant, or chronic hepatitis, liver cirrhosis, or hepatocellular carcinoma. Sequence for protein X of HBV was retrieved from the UniProt database. ProtParam from the ExPASy server was used to investigate the physicochemical properties of the protein. Homology modeling was carried out using Phyre2 server, and refin...

Genome-Scale Protein Function Prediction in Yeast Saccharomyces cerevisiae Through Integrating Multiple Sources of High-Throughput Data

As we are moving into the post genome-sequencing era, various high-throughput experimental techniques have been developed to characterize biological systems at the genome scale. Discovering new biological knowledge from high-throughput biological data is a major challenge for bioinformatics today. To address this challenge, we developed a Bayesian statistical method together with Boltzmann mach...

Expression and Purification of Neurotoxin-Associated Protein HA-33/A from Clostridium botulinum and Evaluation of Its Antigenicity

Background: Botulinum neurotoxin (BoNT) complexes consist of neurotoxin and neurotoxin-associated proteins. Hemagglutinin-33 (HA-33) is a member of BoNT type A (BoNT/A) complex. Considering the protective role of HA-33 in preservation of BoNT/A in gastrointestinal harsh conditions and also its adjuvant role, recombinant production of this protein is favorable. Thus in this study, HA-33 was expr...

Using the Protein-protein Interaction Network to Identifying the Biomarkers in Evolution of the Oocyte

Background Oocyte maturity includes nuclear and cytoplasmic maturity, both of which are important for embryo fertilization. The development of oocyte is not limited to the period of follicular growth, and starts from the embryonic period and continues throughout life. In this study, for the purpose of evaluating the effect of the FSH hormone on the expression of genes, GEO access codes for this...

Journal:
  • Bioinformatics

Volume 26, Issue 23

Pages: -

Publication date: 2010